ROL_MajExp__taxprofiler__042225_1

        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411

        About MultiQC

        This report was generated using MultiQC, version 1.25.1

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/MultiQC/MultiQC

        A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.

        This report has been generated by the nf-core/taxprofiler analysis pipeline. For information about how to interpret these results, please see the documentation.

        Report generated on 2025-04-22, 21:26 PDT based on data in: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp/Databases/Deworming/Nextflow/work/ca/f69cb119076f80a0ee129468787b65


        General Statistics

        By default, all read count columns are displayed as millions (M) of reads.
        Showing 0/42 rows and 7/14 columns.
        Alignment statistics:

        Sample Name                          % Aligned  Error rate  Non-primary  Reads mapped  % Mapped  % Proper pairs  % MapQ 0 reads  Total seqs
        TS047_RoL_RNA_16_TS047_RoL_RNA_16    75.3%      2.21%       0.0M         35.7M         75.3%     56.7%           6.4%            47.4M
        TS047_RoL_RNA_177_TS047_RoL_RNA_177  72.3%      2.27%       0.0M         34.9M         72.3%     51.9%           6.5%            48.2M
        TS047_RoL_RNA_20_TS047_RoL_RNA_20    73.6%      2.27%       0.0M         41.9M         73.6%     54.1%           6.6%            56.9M
        TS047_RoL_RNA_291_TS047_RoL_RNA_291  72.6%      2.26%       0.0M         36.4M         72.6%     52.8%           6.3%            50.2M
        TS047_RoL_RNA_328_TS047_RoL_RNA_328  72.0%      2.27%       0.0M         43.0M         72.0%     52.5%           6.6%            59.7M
        TS047_RoL_RNA_355_TS047_RoL_RNA_355  73.3%      2.14%       0.0M         29.1M         73.3%     56.9%           6.3%            39.7M
        TS047_RoL_RNA_46_TS047_RoL_RNA_46    64.6%      2.72%       0.0M         33.2M         64.6%     40.4%           12.4%           51.5M
        TS047_RoL_RNA_477_TS047_RoL_RNA_477  72.8%      2.24%       0.0M         41.4M         72.8%     53.1%           6.4%            56.8M
        TS047_RoL_RNA_498_TS047_RoL_RNA_498  72.6%      2.20%       0.0M         40.7M         72.6%     53.4%           6.4%            56.0M
        TS047_RoL_RNA_526_TS047_RoL_RNA_526  68.7%      2.47%       0.0M         51.1M         68.7%     46.5%           9.4%            74.4M
        TS047_RoL_RNA_532_TS047_RoL_RNA_532  72.2%      2.26%       0.0M         38.0M         72.2%     51.8%           6.4%            52.6M
        TS047_RoL_RNA_655_TS047_RoL_RNA_655  68.6%      2.42%       0.0M         38.4M         68.6%     46.1%           8.5%            56.0M
        TS047_RoL_RNA_708_TS047_RoL_RNA_708  71.2%      2.24%       0.0M         48.0M         71.2%     51.1%           6.3%            67.5M
        TS047_RoL_RNA_710_TS047_RoL_RNA_710  73.1%      2.17%       0.0M         38.7M         73.1%     54.1%           5.9%            53.0M

        Taxonomic classification (the Danio rerio, Top 5 species and Unclassified columns appear twice in the report header; each row below fills one of the two sets):

        Sample Name                                                    Danio rerio  Top 5 species  Unclassified
        TS047_RoL_RNA_16_TS047_RoL_RNA_16_kraken2_custom_k2            96.2%        96.2%          3.8%
        TS047_RoL_RNA_16_TS047_RoL_RNA_16_kraken2_custom_k2.bracken    96.2%        96.2%          3.8%
        TS047_RoL_RNA_177_TS047_RoL_RNA_177_kraken2_custom_k2          96.3%        96.3%          3.7%
        TS047_RoL_RNA_177_TS047_RoL_RNA_177_kraken2_custom_k2.bracken  96.3%        96.3%          3.7%
        TS047_RoL_RNA_20_TS047_RoL_RNA_20_kraken2_custom_k2            96.5%        96.5%          3.5%
        TS047_RoL_RNA_20_TS047_RoL_RNA_20_kraken2_custom_k2.bracken    96.5%        96.5%          3.5%
        TS047_RoL_RNA_291_TS047_RoL_RNA_291_kraken2_custom_k2          95.2%        95.2%          4.8%
        TS047_RoL_RNA_291_TS047_RoL_RNA_291_kraken2_custom_k2.bracken  95.2%        95.2%          4.8%
        TS047_RoL_RNA_328_TS047_RoL_RNA_328_kraken2_custom_k2          94.2%        94.2%          5.8%
        TS047_RoL_RNA_328_TS047_RoL_RNA_328_kraken2_custom_k2.bracken  94.2%        94.2%          5.8%
        TS047_RoL_RNA_355_TS047_RoL_RNA_355_kraken2_custom_k2          82.7%        82.7%          17.3%
        TS047_RoL_RNA_355_TS047_RoL_RNA_355_kraken2_custom_k2.bracken  82.7%        82.7%          17.3%
        TS047_RoL_RNA_46_TS047_RoL_RNA_46_kraken2_custom_k2            98.9%        98.9%          1.1%
        TS047_RoL_RNA_46_TS047_RoL_RNA_46_kraken2_custom_k2.bracken    98.9%        98.9%          1.1%
        TS047_RoL_RNA_477_TS047_RoL_RNA_477_kraken2_custom_k2          95.8%        95.8%          4.2%
        TS047_RoL_RNA_477_TS047_RoL_RNA_477_kraken2_custom_k2.bracken  95.8%        95.8%          4.2%
        TS047_RoL_RNA_498_TS047_RoL_RNA_498_kraken2_custom_k2          94.0%        94.0%          6.0%
        TS047_RoL_RNA_498_TS047_RoL_RNA_498_kraken2_custom_k2.bracken  94.0%        94.0%          6.0%
        TS047_RoL_RNA_526_TS047_RoL_RNA_526_kraken2_custom_k2          97.7%        97.7%          2.3%
        TS047_RoL_RNA_526_TS047_RoL_RNA_526_kraken2_custom_k2.bracken  97.7%        97.7%          2.3%
        TS047_RoL_RNA_532_TS047_RoL_RNA_532_kraken2_custom_k2          96.4%        96.4%          3.6%
        TS047_RoL_RNA_532_TS047_RoL_RNA_532_kraken2_custom_k2.bracken  96.4%        96.4%          3.6%
        TS047_RoL_RNA_655_TS047_RoL_RNA_655_kraken2_custom_k2          97.3%        97.3%          2.7%
        TS047_RoL_RNA_655_TS047_RoL_RNA_655_kraken2_custom_k2.bracken  97.3%        97.3%          2.7%
        TS047_RoL_RNA_708_TS047_RoL_RNA_708_kraken2_custom_k2          95.8%        95.8%          4.2%
        TS047_RoL_RNA_708_TS047_RoL_RNA_708_kraken2_custom_k2.bracken  95.8%        95.8%          4.2%
        TS047_RoL_RNA_710_TS047_RoL_RNA_710_kraken2_custom_k2          95.3%        95.3%          4.7%
        TS047_RoL_RNA_710_TS047_RoL_RNA_710_kraken2_custom_k2.bracken  95.3%        95.3%          4.7%
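
As a quick consistency check on the General Statistics values, % Mapped should be roughly Reads mapped divided by Total seqs. The sketch below recomputes it for three samples from the rounded figures shown in the report. Because the inputs are rounded to 0.1M, agreement is only expected to within a few tenths of a percent. The function name `percent_mapped` is illustrative and not part of MultiQC.

```python
def percent_mapped(reads_mapped_m: float, total_seqs_m: float) -> float:
    """Return mapped reads as a percentage of total sequences.

    Both inputs are in millions of reads, as displayed in the report.
    """
    if total_seqs_m <= 0:
        raise ValueError("total_seqs_m must be positive")
    return 100.0 * reads_mapped_m / total_seqs_m


# (reads mapped, total seqs, reported % mapped), copied from the report table
ROWS = {
    "TS047_RoL_RNA_16": (35.7, 47.4, 75.3),
    "TS047_RoL_RNA_46": (33.2, 51.5, 64.6),
    "TS047_RoL_RNA_526": (51.1, 74.4, 68.7),
}

for name, (mapped, total, reported) in ROWS.items():
    recomputed = percent_mapped(mapped, total)
    print(f"{name}: recomputed {recomputed:.1f}%, reported {reported}%")
```

Small differences between the recomputed and reported values are a rounding effect of the displayed inputs, not a sign of a problem in the data.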

        bowtie2

        Results from both Bowtie 2 and HISAT2, tools for aligning reads against a reference genome.
        URL: http://bowtie-bio.sourceforge.net/bowtie2; https://ccb.jhu.edu/software/hisat2
        DOI: 10.1038/nmeth.1923; 10.1038/nmeth.3317; 10.1038/s41587-019-0201-4

        Paired-end alignments

        This plot shows the number of reads aligning to the reference in different ways.

        Please note that single mate alignment counts are halved to tally with pair counts properly.

        There are 6 possible types of alignment:

        • PE mapped uniquely: Pair has only one occurrence in the reference genome.
        • PE mapped discordantly uniquely: Pair has only one occurrence, but not as a proper pair.
        • PE one mate mapped uniquely: One read of a pair has one occurrence.
        • PE multimapped: Pair has multiple occurrences.
        • PE one mate multimapped: One read of a pair has multiple occurrences.
        • PE neither mate aligned: Pair has no occurrence.
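
The halving of single-mate counts mentioned above can be sketched as follows. This is a minimal illustration, assuming the six categories are built from the counts in a Bowtie 2 paired-end alignment summary. The dictionary keys (`concordant_once`, `mate_once`, and so on) are hypothetical names chosen for this example, and the numbers are illustrative rather than taken from this run.

```python
def tally_paired_end(counts: dict) -> dict:
    """Convert paired-end alignment counts into per-pair categories.

    Pair-level counts are used as they are. Mate-level counts (individual
    reads) are halved so that every category is expressed in units of pairs.
    """
    return {
        "PE mapped uniquely": counts["concordant_once"],
        "PE mapped discordantly uniquely": counts["discordant_once"],
        "PE one mate mapped uniquely": counts["mate_once"] / 2,
        "PE multimapped": counts["concordant_multi"],
        "PE one mate multimapped": counts["mate_multi"] / 2,
        "PE neither mate aligned": counts["mate_none"] / 2,
    }


# Illustrative counts for 10,000 read pairs (hypothetical, not from this report)
example_counts = {
    "concordant_once": 8823,   # pairs aligned concordantly exactly once
    "concordant_multi": 527,   # pairs aligned concordantly more than once
    "discordant_once": 34,     # pairs aligned discordantly once
    "mate_once": 571,          # single mates aligned exactly once
    "mate_multi": 1,           # single mates aligned more than once
    "mate_none": 660,          # single mates that did not align
}

categories = tally_paired_end(example_counts)
for label, n_pairs in categories.items():
    print(f"{label}: {n_pairs}")
print("Total pairs:", sum(categories.values()))
```

With the mate counts halved, the six categories add up to the number of input pairs, which is what lets them be drawn as a single stacked bar per sample.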
        Samtools Stats

        Toolkit for interacting with BAM/CRAM files.
        URL: http://www.htslib.org
        DOI: 10.1093/bioinformatics/btp352

        Percent mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads vs. reads mapped with MQ0.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).

        Reads mapped with MQ0 often indicate that the reads are ambiguously mapped to multiple locations in the reference sequence. This can be due to repetitive regions in the genome, the presence of alternative contigs in the reference, or due to reads that are too short to be uniquely mapped. These reads are often filtered out in downstream analyses.
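
The mapped / MQ0 / unmapped split described above can be sketched from the summary-number (`SN`) lines of `samtools stats` output. This is a minimal sketch, assuming the fields `raw total sequences`, `reads mapped` and `reads MQ0` are present and that percentages are taken relative to `raw total sequences`. The function names and the example text are hypothetical, and the numbers are not from this run.

```python
def parse_sn(stats_text: str) -> dict:
    """Collect numeric 'SN' (summary number) lines from samtools stats output."""
    values = {}
    for line in stats_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3 or fields[0] != "SN":
            continue
        key = fields[1].rstrip(":")
        try:
            values[key] = float(fields[2])
        except ValueError:
            continue  # skip any non-numeric entry
    return values


def mapping_breakdown(sn: dict) -> dict:
    """Split reads into mapped (MQ > 0), mapped with MQ0, and unmapped, in percent."""
    total = sn["raw total sequences"]
    mapped = sn["reads mapped"]
    mq0 = sn["reads MQ0"]
    return {
        "mapped_mq_gt0_pct": 100.0 * (mapped - mq0) / total,
        "mapped_mq0_pct": 100.0 * mq0 / total,
        "unmapped_pct": 100.0 * (total - mapped) / total,
    }


# Hypothetical excerpt of samtools stats output
EXAMPLE_STATS = "\n".join([
    "# This line is a comment and is ignored",
    "SN\traw total sequences:\t1000",
    "SN\treads mapped:\t750",
    "SN\treads MQ0:\t60\t# mapped and MQ=0",
])

breakdown = mapping_breakdown(parse_sn(EXAMPLE_STATS))
for label, pct in breakdown.items():
    print(f"{label}: {pct:.1f}%")
```

The three percentages sum to 100, so they can be plotted as one stacked bar per sample.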

        Alignment stats

        This module parses the output from samtools stats. All numbers in millions.

        Kraken

        Taxonomic classification using exact k-mer matches to find the lowest common ancestor (LCA) of a given sequence.
        URL: https://ccb.jhu.edu/software/kraken
        DOI: 10.1186/gb-2014-15-3-r46
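
To illustrate what a lowest common ancestor is, here is a toy sketch over a small parent-pointer taxonomy. It shows only the LCA operation itself and is not Kraken's classification procedure, which works on k-mer hits against a database. The taxonomy, its labels and the function names are all hypothetical.

```python
def lineage(taxon: str, parent: dict) -> list:
    """Return the path from a taxon up to the root, starting with the taxon."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa: list, parent: dict) -> str:
    """Return the deepest node shared by the lineages of all given taxa.

    Assumes every taxon belongs to the same rooted tree.
    """
    if not taxa:
        raise ValueError("at least one taxon is required")
    common = lineage(taxa[0], parent)  # ordered from leaf to root
    for taxon in taxa[1:]:
        ancestors = set(lineage(taxon, parent))
        common = [node for node in common if node in ancestors]
    return common[0]


# Toy taxonomy: child -> parent (labels are placeholders, not real taxa)
PARENT = {
    "genus_A": "root",
    "genus_B": "root",
    "species_A1": "genus_A",
    "species_A2": "genus_A",
    "species_B1": "genus_B",
}

print(lowest_common_ancestor(["species_A1", "species_A2"], PARENT))
print(lowest_common_ancestor(["species_A1", "species_B1"], PARENT))
```

A sequence whose matches fall in two species of the same genus would resolve to that genus, while matches spread across genera resolve higher up the tree.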

        Top taxa

        The number of reads falling into the top 5 taxa across different ranks.

        To make this plot, the percentage of each sample assigned to a given taxon is summed across all samples. The counts for these top 5 taxa are then plotted for each of the 9 different taxonomic ranks. The unclassified count is always shown across all ranks.

        The total number of reads is approximated by dividing the number of unclassified reads by the percentage of the library that they account for. Note that this is only an approximation, and that kraken percentages don't always add to exactly 100%.

        The category "Other" shows the difference between the above total read count and the sum of the read counts in the top 5 taxa shown + unclassified. This should cover all taxa not in the top 5, +/- any rounding errors.

        Note that any taxon that does not exactly fit a taxon rank (eg. - or G2) is ignored.
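
The arithmetic behind the total read count and the "Other" category can be sketched as below. This is a minimal illustration of the description above, not MultiQC's actual code. The function names and the input numbers are hypothetical.

```python
def approximate_total_reads(unclassified_reads: float, unclassified_pct: float) -> float:
    """Approximate the library size from the unclassified count and its percentage."""
    if unclassified_pct <= 0:
        raise ValueError("unclassified_pct must be greater than zero")
    return unclassified_reads / (unclassified_pct / 100.0)


def other_reads(total_reads: float, top_taxa_reads: list, unclassified_reads: float) -> float:
    """Reads not covered by the plotted top taxa or by the unclassified count."""
    return total_reads - sum(top_taxa_reads) - unclassified_reads


# Hypothetical sample: 38,000 unclassified reads reported as 3.8% of the library
unclassified = 38_000
total = approximate_total_reads(unclassified, 3.8)

# Hypothetical read counts for the top 5 taxa at one rank
top5 = [900_000, 30_000, 15_000, 10_000, 5_000]
other = other_reads(total, top5, unclassified)

print(f"Approximate total reads: {total:.0f}")
print(f"Other: {other:.0f}")
```

Because the percentage is itself rounded, the total is only approximate, which is why "Other" can be off by a small rounding error.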

        Bracken

        Estimates species abundances in metagenomics samples by probabilistically re-distributing reads in the taxonomic tree.
        URL: https://ccb.jhu.edu/software/kraken
        DOI: 10.7717/peerj-cs.104

        ℹ️: plot title will say Kraken2 due to the first step of bracken producing the same output format as Kraken. Abundance information is currently not supported in MultiQC.

        Top taxa

        The number of reads falling into the top 5 taxa across different ranks.

        To make this plot, the percentage of each sample assigned to a given taxon is summed across all samples. The counts for these top 5 taxa are then plotted for each of the 9 different taxonomic ranks. The unclassified count is always shown across all ranks.

        The total number of reads is approximated by dividing the number of unclassified reads by the percentage of the library that they account for. Note that this is only an approximation, and that kraken percentages don't always add to exactly 100%.

        The category "Other" shows the difference between the above total read count and the sum of the read counts in the top 5 taxa shown + unclassified. This should cover all taxa not in the top 5, +/- any rounding errors.

        Note that any taxon that does not exactly fit a taxon rank (eg. - or G2) is ignored.

        Software Versions

        Software Versions lists versions of software tools extracted from file contents.

        Group                               Software             Version
        BOWTIE2_ALIGN                       bowtie2              2.5.2
                                            pigz                 2.6
                                            samtools             1.18
        BOWTIE2_BUILD                       bowtie2              2.5.2
        KRAKEN2_KRAKEN2                     kraken2              2.1.3
                                            pigz                 2.8
        KRAKENTOOLS_COMBINEKREPORTS_KRAKEN  combine_kreports.py  1.2
        KRAKENTOOLS_KREPORT2KRONA           kreport2krona.py     1.2
        KRONA_CLEANUP                       sed                  4.7
        KRONA_KTIMPORTTEXT                  krona                2.8.1
        MINIMAP2_INDEX                      minimap2             2.28-r1209
        SAMTOOLS_INDEX                      samtools             1.2
        Samtools Stats                      samtools             1.2
        TAXPASTA_MERGE                      taxpasta             0.7.0
        Workflow                            Nextflow             24.10.4
                                            nf-core/taxprofiler  v1.2.2

        nf-core/taxprofiler Methods Description

        Suggested text and references to use when describing pipeline usage within the methods section of a publication.
        URL: https://github.com/nf-core/taxprofiler

        Methods

        Data was processed using nf-core/taxprofiler v1.2.2 (doi: 10.1101/2023.10.20.563221) of the nf-core collection of workflows (Ewels et al., 2020), utilising reproducible software environments from the Bioconda (Grüning et al., 2018) and Biocontainers (da Veiga Leprevost et al., 2017) projects.

        The pipeline was executed with Nextflow v24.10.4 (Di Tommaso et al., 2017) with the following command:

        nextflow run /local/cqls/software/nextflow/assets/nf-core-taxprofiler_1.2.2/1_2_2 -profile singularity -config nextflow.config -resume -params-file nf-params.json

        Tools used in the workflow included: Sequencing quality control with FastQC (Andrews 2010). Host read removal was performed for short reads with Bowtie2 (Langmead and Salzberg 2012) and SAMtools (Danecek et al. 2021). Host read removal was performed for long reads with minimap2 (Li 2018) and SAMtools (Danecek et al. 2021). Taxonomic classification or profiling was carried out with: Bracken (Lu et al. 2017), Kraken2 (Wood et al. 2019). Visualisation of results, where supported, was performed with Krona (Ondov et al. 2011). Standardisation of taxonomic profiles was carried out with TAXPASTA (Beber et al. 2023). Pipeline results statistics were summarised with MultiQC (Ewels et al. 2016).

        References

        • Di Tommaso, P., Chatzou, M., Floden, E. W., Barja, P. P., Palumbo, E., & Notredame, C. (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316-319. doi: 10.1038/nbt.3820
        • Ewels, P. A., Peltzer, A., Fillinger, S., Patel, H., Alneberg, J., Wilm, A., Garcia, M. U., Di Tommaso, P., & Nahnsen, S. (2020). The nf-core framework for community-curated bioinformatics pipelines. Nature Biotechnology, 38(3), 276-278. doi: 10.1038/s41587-020-0439-x
        • Grüning, B., Dale, R., Sjödin, A., Chapman, B. A., Rowe, J., Tomkins-Tinch, C. H., Valieris, R., Köster, J., & Bioconda Team. (2018). Bioconda: sustainable and comprehensive software distribution for the life sciences. Nature Methods, 15(7), 475–476. doi: 10.1038/s41592-018-0046-7
        • da Veiga Leprevost, F., Grüning, B. A., Alves Aflitos, S., Röst, H. L., Uszkoreit, J., Barsnes, H., Vaudel, M., Moreno, P., Gatto, L., Weber, J., Bai, M., Jimenez, R. C., Sachsenberg, T., Pfeuffer, J., Vera Alvarez, R., Griss, J., Nesvizhskii, A. I., & Perez-Riverol, Y. (2017). BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics (Oxford, England), 33(16), 2580–2582. doi: 10.1093/bioinformatics/btx192
        • Stamouli, S., Beber, M. E., Normark, T., Christensen, T. A., Andersson-Li, L., Borry, M., Jamy, M., nf-core community, & Fellows Yates, J. A. (2023). nf-core/taxprofiler: Highly parallelised and flexible pipeline for metagenomic taxonomic classification and profiling. (Preprint). bioRxiv 2023.10.20.563221. doi: 10.1101/2023.10.20.563221
        • Andrews S. (2010) FastQC: A Quality Control Tool for High Throughput Sequence Data, URL: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
        • Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357–359. doi: 10.1038/nmeth.1923
        • Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34(18), 3094–3100. doi: 10.1093/bioinformatics/bty191
        • Danecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., Keane, T., McCarthy, S. A., Davies, R. M., & Li, H. (2021). Twelve years of SAMtools and BCFtools. GigaScience, 10(2). doi: 10.1093/gigascience/giab008
        • Lu, J., Breitwieser, F. P., Thielen, P., & Salzberg, S. L. (2017). Bracken: estimating species abundance in metagenomics data. PeerJ. Computer Science, 3(e104), e104. doi: 10.7717/peerj-cs.104
        • Wood, D. E., Lu, J., & Langmead, B. (2019). Improved metagenomic analysis with Kraken 2. Genome Biology, 20(1), 257. doi: 10.1186/s13059-019-1891-0
        • Ondov, B. D., Bergman, N. H., & Phillippy, A. M. (2011). Interactive metagenomic visualization in a Web browser. BMC Bioinformatics, 12(1), 385. doi: 10.1186/1471-2105-12-385
        • Beber, M. E., Borry, M., Stamouli, S., & Fellows Yates, J. A. (2023). TAXPASTA: TAXonomic Profile Aggregation and STAndardisation. Journal of Open Source Software, 8(87), 5627. doi: 10.21105/joss.05627
        • Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047–3048. doi: 10.1093/bioinformatics/btw354
        Notes:
        • The command above does not include parameters contained in any configs or profiles that may have been used. Ensure the config file is also uploaded with your publication!
        • You should also cite all software used within this run. Check the "Software Versions" of this report to get version information.

        nf-core/taxprofiler Workflow Summary

        This information is collected when the pipeline is started.
        URL: https://github.com/nf-core/taxprofiler

        Input/output options

        databases: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp/Databases/Deworming/Nextflow/database_sheet.csv
        email: sielerjm@oregonstate.edu
        input: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp/Databases/Deworming/Nextflow/filtered_samplesheet.csv
        multiqc_title: ROL_MajExp__taxprofiler__042225_1
        outdir: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp/Transcriptomics/Results/taxprofiler

        Preprocessing general QC options

        skip_preprocessing_qc: true

        Preprocessing host removal options

        hostremoval_reference: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp/Databases/NCBI/Zebrafish/ZF_genome.fa
        perform_longread_hostremoval: true
        perform_shortread_hostremoval: true

        Profiling options

        bracken_save_intermediatekraken2: true
        run_bracken: true
        run_kraken2: true

        Postprocessing and visualisation options

        run_krona: true
        run_profile_standardisation: true
        taxpasta_add_name: true
        taxpasta_add_rank: true
        taxpasta_ignore_errors: true
        taxpasta_taxonomy_dir: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp/Databases/Deworming/kraken2_custom_k2/taxonomy

        Generic options

        trace_report_suffix: 2025-04-22_23-28-42

        Core Nextflow options

        configFiles: N/A
        containerEngine: singularity
        launchDir: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp/Databases/Deworming/Nextflow
        profile: singularity
        projectDir: /local/cqls/software/nextflow/assets/nf-core-taxprofiler_1.2.2/1_2_2
        runName: modest_becquerel
        userName: sielerjm
        workDir: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp/Databases/Deworming/Nextflow/work
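
The Methods command passes its parameters with `-params-file nf-params.json`, but the file itself is not included in the report. The sketch below rebuilds a plausible version of it from the pipeline parameters listed in this Workflow Summary. It assumes all of them were supplied through that file rather than through `nextflow.config`, which the report does not confirm. The auto-generated `trace_report_suffix` and the `email` entry are left out here, and the variable and function names are illustrative.

```python
import json

# Paths copied from the Workflow Summary; BASE only shortens the repetition.
BASE = "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/sielerjm/ZF__MajorExp"
NEXTFLOW_DIR = f"{BASE}/Databases/Deworming/Nextflow"

params = {
    # Input/output options
    "input": f"{NEXTFLOW_DIR}/filtered_samplesheet.csv",
    "databases": f"{NEXTFLOW_DIR}/database_sheet.csv",
    "outdir": f"{BASE}/Transcriptomics/Results/taxprofiler",
    "multiqc_title": "ROL_MajExp__taxprofiler__042225_1",
    # Preprocessing
    "skip_preprocessing_qc": True,
    "perform_shortread_hostremoval": True,
    "perform_longread_hostremoval": True,
    "hostremoval_reference": f"{BASE}/Databases/NCBI/Zebrafish/ZF_genome.fa",
    # Profiling
    "run_kraken2": True,
    "run_bracken": True,
    "bracken_save_intermediatekraken2": True,
    # Postprocessing and visualisation
    "run_krona": True,
    "run_profile_standardisation": True,
    "taxpasta_add_name": True,
    "taxpasta_add_rank": True,
    "taxpasta_ignore_errors": True,
    "taxpasta_taxonomy_dir": f"{BASE}/Databases/Deworming/kraken2_custom_k2/taxonomy",
}


def render_params(p: dict) -> str:
    """Serialise the parameters as indented JSON text."""
    return json.dumps(p, indent=2)


print(render_params(params))
```

Saving the printed text as `nf-params.json` would give a file in the form that `-params-file` accepts. The original file may differ in content and ordering.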